WebTraceMiner: a web service for processing and mining EST sequence trace files
Authors
Abstract
Expressed sequence tags (ESTs) remain a dominant approach for characterizing the protein-encoding portions of various genomes. Due to inherent deficiencies, they also present serious challenges for data quality control. Before GenBank submission, EST sequences are typically screened and trimmed of vector and adapter/linker sequences, as well as polyA/T tails. Removal of these sequences presents an obstacle for data validation of error-prone ESTs and impedes data mining of certain functional motifs, whose detection relies on accurate annotation of positional information for polyA tails added posttranscriptionally. As raw DNA sequence information is made increasingly available from public repositories, such as NCBI Trace Archive, new tools will be necessary to reanalyze and mine this data for new information. WebTraceMiner (www.conifergdb.org/software/wtm) was designed as a public sequence processing service for raw EST traces, with a focus on detection and mining of sequence features that help characterize 3' and 5' termini of cDNA inserts, including vector fragments, adapter/linker sequences, insert-flanking restriction endonuclease recognition sites and polyA or polyT tails. WebTraceMiner complements other public EST resources and should prove to be a unique tool to facilitate data validation and mining of error-prone ESTs (e.g. discovery of new functional motifs).
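The terminal features named in the abstract (polyA or polyT tails and insert-flanking restriction sites) can be illustrated with a small positional scan over a raw read. The sketch below is not WebTraceMiner's algorithm, which the abstract does not describe. The function name `find_features`, the 10-base minimum tail length, the EcoRI site `GAATTC`, and the toy read are all assumptions chosen for illustration.

```python
import re

def find_features(read, min_tail=10, site="GAATTC"):
    """Return 0-based, end-exclusive (start, end) spans of candidate features.

    min_tail and site are illustrative assumptions, not values from the paper.
    """
    # Homopolymer runs of at least min_tail bases are treated as candidate tails.
    poly_a = [m.span() for m in re.finditer(r"A{%d,}" % min_tail, read)]
    poly_t = [m.span() for m in re.finditer(r"T{%d,}" % min_tail, read)]
    # Exact matches to one restriction endonuclease recognition site.
    sites = [m.span() for m in re.finditer(re.escape(site), read)]
    return {"polyA": poly_a, "polyT": poly_t, "site": sites}

# Toy untrimmed read (hypothetical): EcoRI site, short insert, 15-base polyA tail, XhoI site.
read = "GAATTC" + "ACGTGCATGCCGTC" + "A" * 15 + "CTCGAG"
features = find_features(read.upper())
print(features)
```

Because the read is left untrimmed, the reported spans keep the position of the tail relative to the insert and the flanking sites, which is the positional information the abstract says is lost when sequences are trimmed before submission.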
Journal title:
Volume 35, Issue
Pages -
Publication date: 2007